Skip to content

Conversation

@village-way
Copy link
Contributor

@village-way village-way commented Jun 2, 2025

Related GitHub Issue

Closes: #4217

Description

  • Model-generated output appears first
  • An "initial checkpoint" indicator shows up
  • Previously generated content reappears, causing duplication
  • This happens when the model responds quickly but checkpoint initialization is slow

Test Procedure

Type of Change

  • 🐛 Bug Fix: Non-breaking change that fixes an issue.
  • New Feature: Non-breaking change that adds functionality.
  • 💥 Breaking Change: Fix or feature that would cause existing functionality to not work as expected.
  • ♻️ Refactor: Code change that neither fixes a bug nor adds a feature.
  • 💅 Style: Changes that do not affect the meaning of the code (white-space, formatting, etc.).
  • 📚 Documentation: Updates to documentation files.
  • ⚙️ Build/CI: Changes to the build process or CI configuration.
  • 🧹 Chore: Other changes that don't modify src or test files.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Code Quality:
    • My code adheres to the project's style guidelines.
    • There are no new linting errors or warnings (npm run lint).
    • All debug code (e.g., console.log) has been removed.
  • Testing:
    • New and/or updated tests have been added to cover my changes.
    • All tests pass locally (npm test).
    • The application builds successfully with my changes.
  • Branch Hygiene: My branch is up-to-date (rebased) with the main branch.
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Changeset: A changeset has been created using npm run changeset if this PR includes user-facing changes or dependency updates.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Documentation Updates

Additional Notes

Get in Touch


Important

Ensures checkpoint initialization completes before starting the main loop in Task.ts to prevent content duplication in ChatView.

  • Behavior:
    • Ensures checkpoint initialization completes before starting the main loop in initiateTaskLoop() in Task.ts to prevent content duplication.
    • Introduces ensureCheckpointInitialization() in Task.ts to handle checkpoint initialization with a timeout and error handling.
  • Misc:
    • Minor formatting changes in sayAndCreateMissingParamError() and resumeTaskFromHistory() in Task.ts.

This description was created by Ellipsis for d08a8ea26f5d79dce7da89a59bf454aa852b7bc5. You can customize this summary. It will automatically update as commits are pushed.

@village-way village-way requested review from cte and mrubens as code owners June 2, 2025 08:01
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. bug Something isn't working labels Jun 2, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jun 2, 2025
hannesrudolph

This comment was marked as duplicate.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 6, 2025
Copy link
Collaborator

@hannesrudolph hannesrudolph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Fix async checkpoint initialization causing duplicate output in ChatView

Reviewed by Claude Code (claude.ai/code)

Overview

This PR addresses a race condition where model-generated output could appear, followed by checkpoint initialization messages, and then previously generated content would reappear, causing duplication in the ChatView. The fix ensures checkpoint initialization completes before starting the main task loop.

✅ Strengths

Problem Identification:

  • Clearly identifies a race condition between fast model responses and slow checkpoint initialization
  • Well-documented issue (#4217) that addresses a real UX problem
  • Good understanding of the asynchronous timing issue

Implementation Approach:

  • Targeted fix that addresses the root cause (timing) rather than symptoms
  • Maintains existing checkpoint functionality while fixing the race condition
  • Proper error handling and timeout mechanisms
  • Graceful degradation by disabling checkpoints if initialization fails

Code Quality:

  • Well-documented method with clear JSDoc explaining the purpose
  • Proper async/await usage with error handling
  • Good use of existing patterns (pWaitFor, timeout handling)
  • Maintains backward compatibility

🔍 Technical Implementation

Key Changes:

  1. initiateTaskLoop(): Changed from background initialization to awaited initialization
  2. ensureCheckpointInitialization(): New method with timeout and error handling
  3. Minor formatting: Cleaned up template literals (non-functional)

Race Condition Fix:
Before: Background initialization (race condition) with getCheckpointService(this)
After: Synchronized initialization with await this.ensureCheckpointInitialization()

✅ Code Quality Assessment

Async Handling:

  • Proper use of pWaitFor with reasonable timeout (10 seconds)
  • Good interval checking (100ms) for responsiveness
  • Early exit on task abort to prevent hanging

Error Handling:

  • Try-catch wraps the entire initialization process
  • Graceful fallback: disables checkpoints on failure/timeout
  • Appropriate logging for debugging

Performance Considerations:

  • 10-second timeout prevents indefinite hanging
  • 100ms polling interval balances responsiveness vs CPU usage
  • Early abort detection prevents unnecessary waiting

⚠️ Potential Considerations

  1. Performance Impact:

    • Tasks now wait for checkpoint initialization before starting
    • Could add up to 10 seconds delay in worst-case scenarios
    • However, this prevents the more serious UX issue of duplicate content
  2. Error Scenarios:

    • If checkpoint service creation fails, checkpoints are disabled
    • This is appropriate behavior but users lose checkpoint functionality
    • Logging helps with debugging but users might not see the issue
  3. Edge Cases:

    • Checkpoint service already initialized → Quick return ✅
    • enableCheckpoints disabled during wait → Exits loop ✅
    • Task abortion during initialization → Proper cleanup ✅

📋 Testing Considerations

Manual Testing Needed:

  • Fast model responses with slow checkpoint initialization
  • Checkpoint initialization timeout scenarios
  • Task abortion during checkpoint initialization
  • Normal checkpoint functionality still works

Unit Testing Opportunities:

  • Test timeout behavior
  • Test abort handling during initialization
  • Test error handling when checkpoint service fails

💡 Minor Suggestions (Non-blocking)

  1. Enhanced Logging: Consider adding start log for better debugging
  2. Configuration: Consider making timeout configurable (currently 10000ms)

Security & Performance

  • ✅ No security implications
  • ⚠️ Potential performance impact: adds up to 10s delay in worst case
  • ✅ No memory leaks or resource issues
  • ✅ Proper cleanup on abort/timeout

Overall Assessment

LGTM

This is a well-implemented fix for a legitimate race condition issue. The solution:

  • Correctly identifies and fixes the root cause
  • Implements proper error handling and timeouts
  • Maintains backward compatibility
  • Provides graceful degradation
  • Includes appropriate logging for debugging

Trade-offs Analysis:

  • Adds potential startup delay vs. eliminates duplicate content UX issue
  • The trade-off is appropriate - better to wait briefly than show confusing duplicate content

Risk Assessment: Low to Medium

  • Low risk for functionality (graceful fallbacks)
  • Medium impact on performance (potential 10s delay)
  • High benefit for UX (eliminates confusing duplicate content)

The implementation follows good async patterns and handles edge cases well. The minor suggestions above are not blockers for approval. This is a solid fix that meaningfully improves the user experience.

Copy link
Member

@daniel-lxs daniel-lxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @village-way, Thanks for your contribution. The core fix looks solid.

I've left a few suggestions for improvements:

  1. Consider making the timeout configurable
  2. Minor code cleanup opportunity in error handling
  3. Formatting changes that could be separated

Overall, this is a good solution to the race condition issue.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jun 6, 2025
@village-way village-way requested a review from jr as a code owner June 7, 2025 03:01
@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Review] in Roo Code Roadmap Jun 16, 2025
@daniel-lxs daniel-lxs moved this from PR [Needs Review] to PR [Needs Prelim Review] in Roo Code Roadmap Jun 16, 2025
@daniel-lxs
Copy link
Member

Hey @village-way, it seems that this implementation is causing all the checkpoints to be "Initial Checkpoints" instead of the usual "Checkpoint". While they seem to work, it might be confusing for users:

image

Here’s how a normal checkpoint looks:

image

I'm not sure what's going on or if I'm doing something wrong.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jun 16, 2025
@village-way
Copy link
Contributor Author

Hey @village-way, it seems that this implementation is causing all the checkpoints to be "Initial Checkpoints" instead of the usual "Checkpoint". While they seem to work, it might be confusing for users:

image

Here’s how a normal checkpoint looks:

image

I'm not sure what's going on or if I'm doing something wrong.

I can't reproduce the issue you're describing. On my side, the checkpoint information displayed on the ChatView page appears normal — the first one is labeled "Initial Checkpoint," and the subsequent ones are just "Checkpoint." Also, we haven’t made any changes to the display logic in the code, so I'm quite curious about how this issue occurred. Was it reproduced based on our PR, or could it be due to a recent version merging our commit? I tested it using the branch from our PR and didn’t encounter the problem. If possible, please provide more details so I can investigate further.

image

@hannesrudolph hannesrudolph moved this from PR [Changes Requested] to PR [Needs Prelim Review] in Roo Code Roadmap Jun 21, 2025
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jun 21, 2025
Copy link
Member

@daniel-lxs daniel-lxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @village-way, I was able to confirm it is probably an issue on my end since I see the duplicate "Initial Checkpoint" on the main branch as well.

LGTM

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Needs Review] in Roo Code Roadmap Jun 22, 2025
@mrubens
Copy link
Collaborator

mrubens commented Jun 22, 2025

Have we considered the approach where we keep the initial checkpoint async but insert the checkpoint message at the top of the chat instead of splitting the message in progress with it?

Basically I'm unsure about the cost of disabling checkpoints for cases where it takes longer than 10 seconds to initialize (as well as adding a blocking delay at the beginning of the task).

@daniel-lxs daniel-lxs moved this from PR [Needs Review] to PR [Changes Requested] in Roo Code Roadmap Jun 22, 2025
@daniel-lxs
Copy link
Member

daniel-lxs commented Jun 22, 2025

So basically fixing the issue form the UI side rather than changing the behavior of the checkpoints.

We can explore this solution a bit more before deciding, in the meantime I'll move this PR to "In Progress" to clean up a bit.

@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Draft / In Progress] in Roo Code Roadmap Jun 22, 2025
@daniel-lxs daniel-lxs marked this pull request as draft June 22, 2025 14:54
@village-way
Copy link
Contributor Author

Have we considered the approach where we keep the initial checkpoint async but insert the checkpoint message at the top of the chat instead of splitting the message in progress with it?

Basically I'm unsure about the cost of disabling checkpoints for cases where it takes longer than 10 seconds to initialize (as well as adding a blocking delay at the beginning of the task).

When I first started working on the implementation, I also followed this approach. However, after several attempts, I still couldn’t resolve the UI rendering issue. The logic for UI rendering and streaming data processing is relatively complex. It would be ideal if we could maintain asynchronous logic while also solving this problem. Since this issue significantly affects the user experience, especially when disk performance is poor but the server responds quickly, I temporarily adopted a synchronous approach to address it. I hope there is a better solution. Thank you.

@hannesrudolph
Copy link
Collaborator

stale

@github-project-automation github-project-automation bot moved this from PR [Draft / In Progress] to Done in Roo Code Roadmap Jul 7, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jul 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer PR - Draft / In Progress size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

the race condition between checkpoint initialization and model output streaming, result to duplicate output in ChatView

4 participants